The Role of Cognates in Word Acquisition
PhD Defence / Departament de Medicina i Ciències de la Vida
2024-11-03
The average English-native 20-year-old knows ~42,000 words: the mental lexicon
Lexical development:
Inter-modal experimental paradigms
(Bergelson and Swingley 2012, 2015; Tincoff and Jusczyk 1999, 2012)

Parental reports and surveys
e.g., MacArthur-Bates Communicative Development Inventory (CDI) (Fenson et al. 1994)

Vocabulary size grows non-linearly during the second year of life
Figure 1: Vocabulary size norms for 51,800 monolingual children learning 35 distinct languages (Wordbank)
Cascaded activation: activation spreads across non-selected lexical representations (Allopenna, Magnuson, and Tanenhaus 1998)
In children (Chow, Davies, and Plunkett 2017)
Yet, bilinguals keep up with monolinguals (Sebastian-Galles and Santolin 2020)
Bilinguals acquire words at rates similar to monolinguals (Hoff et al. 2012)
Lexically closer languages ➡️ Larger vocabulary size (Floccia et al. 2018)
English-Dutch > English-Mandarin
Cognate: form-similar translation equivalents (TEs)
| Cognate | Non-cognate |
|---|---|
| [cat] /ˈgat-ˈgato/ | [dog] /ˈgos-ˈpe.ro/ |
Cognates acquired earlier than non-cognates (Mitchell, Tsui, and Byers-Heinlein 2023; Bosch and Ramon-Casas 2014)
Figure 2: Pairwise lexical similarity (average Levenshtein similarity across translations).
What mechanisms support cognate facilitation during word acquisition?
Language non-selective lexical access
Cognate beginnings to lexical acquisition
Activation spreads across non-selected representations in both languages, through phonological and conceptual links (e.g., Costa, Caramazza, and Sebastian-Galles 2000).
Evidence in children (Bosma and Nota 2020; De Houwer, Bornstein, and Putnick 2014) and infants (Von Holzen and Mani 2012; Jardak and Byers-Heinlein 2019; Singh 2014).
Accumulator Model of Bilingual Lexical Acquisition (AMBLA)
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{grey}{RGB}{128, 128, 128} \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ {\color{mygreen}\text{Age of Acquisition}_{ij}} &= \{\text{Age}_i \mid {\color{myred}\text{Learning instances}_{ij}} = {\color{myblue}\text{Threshold}} \}\\ \color{myred}{\text{Learning instances}_{ij}} &= \text{Age}_i \cdot \text{Freq}_j \\ \textbf{where:} \\ {\color{myblue}\text{Threshold}} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
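As a minimal sketch of the monolingual accumulator above: since learning instances equal Age times Freq, the age of acquisition is the age at which that product reaches the threshold, i.e. Threshold / Freq. The function name `age_of_acquisition` is hypothetical, and the time unit of the result is whatever unit Freq is counted per, which the slide does not state.

```python
def age_of_acquisition(freq: float, threshold: float = 300.0) -> float:
    """Age at which Age * Freq reaches the threshold (slide value: 300)."""
    if freq <= 0:
        # Assumption: a word that is never heard never reaches the threshold.
        return float("inf")
    return threshold / freq


# A word heard at the Poisson mean (lambda = 50) versus a word heard half as often.
aoa_at_mean_freq = age_of_acquisition(50)
aoa_at_half_freq = age_of_acquisition(25)
print(aoa_at_mean_freq, aoa_at_half_freq)
```

Halving the frequency doubles the age of acquisition, which is the only behaviour this version of the model can express.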
Extended to the bilingual case:
Exposure: proportion of time exposed to the language of word \(j\)
Accumulation of learning instances is a function of Exposure and Frequency.
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot {\color{myred}\text{Exposure}_{ij}}\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \end{aligned} \]
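The bilingual extension can be sketched as a step-by-step accumulation, where each time step adds Freq times Exposure learning instances. This is an illustrative reading, not the thesis implementation: the names `simulate_aoa` and `max_age` are hypothetical, time advances in integer steps, and the slide's equality with the threshold is replaced by "greater than or equal" because accumulation here is discrete.

```python
from typing import Optional


def simulate_aoa(freq: float, exposure: float,
                 threshold: float = 300.0, max_age: int = 60) -> Optional[int]:
    """First age (in steps) at which accumulated instances reach the threshold.

    Returns None if the threshold is not reached by max_age (hypothetical cap).
    """
    instances = 0.0
    for age in range(1, max_age + 1):
        # Each step contributes Freq * Exposure learning instances.
        instances += freq * exposure
        if instances >= threshold:
            return age
    return None


# Same word frequency (50), different proportions of exposure to its language.
aoa_full = simulate_aoa(50, 1.0)
aoa_half = simulate_aoa(50, 0.5)
aoa_quarter = simulate_aoa(50, 0.25)
print(aoa_full, aoa_half, aoa_quarter)
```

Under these assumptions, a word from the language heard a quarter of the time takes four times as long to reach the threshold as the same word under full exposure.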
Implementing a cognateness facilitation mechanism:
Facilitation proportional to the phonological similarity of the translation equivalents (Cognateness)
\[ \begin{aligned} \definecolor{myred}{RGB}{ 168, 0, 53 } \definecolor{myblue}{RGB}{ 0, 64, 168 } \definecolor{mygreen}{RGB}{0, 168, 87} \definecolor{myorange}{RGB}{ 235, 127, 26 } \textbf{For participant } &i \textbf{ and word-form } j \text{ (translation of } j'): \\ \text{Age of Acquisition}_{ij} &= \{\text{Age}_i \mid \text{Learning instances}_{ij} = \text{Threshold} \}\\ \text{Learning instances}_{ij} &= \text{Age}_i \cdot \text{Freq}_j \cdot \text{Exposure}_{ij} \cdot {\color{myred}\text{Cognateness}_{j}}\\ \textbf{where:} \\ \text{Threshold} &= 300 \\ \text{Freq}_j &\sim \text{Poisson}(\lambda = 50) \\ {\color{myred}\text{Cognateness}}&{\color{myred}=\text{Levenshtein}(j, j')} \end{aligned} \]
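The Cognateness term can be illustrated with a standard Levenshtein edit distance turned into a similarity score. The normalisation used here (one minus the distance divided by the length of the longer form) is an assumption, since the slides only name the measure. The input strings are the table's Catalan-Spanish examples with stress marks and syllable boundaries stripped, and both function names are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; 1 means identical forms (assumed normalisation)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


cognate_similarity = levenshtein_similarity("gat", "gato")      # [cat]
non_cognate_similarity = levenshtein_similarity("gos", "pero")  # [dog]
print(cognate_similarity, non_cognate_similarity)
```

With this normalisation the cognate pair scores 0.75 and the non-cognate pair scores 0.0, matching the intuition behind the table.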
Ordinal, multilevel regression model
\[ \begin{aligned} \text{Exposure}_{ij} &= \text{Frequency}_j \times \text{Language degree of exposure}_{ij} \\ \text{Cognateness}_{j} &= \text{Levenshtein}(j, j') \end{aligned} \]
Bayesian: \[ \text{Posterior} \propto \text{Prior} \times \text{Likelihood} \]
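The proportionality above can be made concrete with a toy grid approximation: evaluate prior times likelihood on a grid of parameter values, then normalise. This is only an illustration of the rule, not the ordinal multilevel model. The binomial likelihood, the flat prior, the data (7 successes in 10 trials) and the name `grid_posterior` are all hypothetical.

```python
from math import comb


def grid_posterior(successes: int, trials: int, grid_size: int = 101):
    """Grid approximation of the posterior for a binomial proportion."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0] * grid_size  # flat prior (assumption)
    likelihood = [
        comb(trials, successes) * p ** successes * (1 - p) ** (trials - successes)
        for p in grid
    ]
    # Posterior is proportional to prior * likelihood; dividing by the sum normalises it.
    unnormalised = [pr * li for pr, li in zip(prior, likelihood)]
    total = sum(unnormalised)
    posterior = [u / total for u in unnormalised]
    return grid, posterior


grid, posterior = grid_posterior(successes=7, trials=10)
posterior_mode = grid[posterior.index(max(posterior))]
print(posterior_mode)
```

With a flat prior the posterior mode coincides with the observed proportion; an informative prior would pull it towards the prior's mass.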
Figure 4: Posterior distribution of fixed regression coefficients
Figure 5: Posterior marginal effects
Only words from the lower-exposure language benefit from cognateness
Parallel to language dominance effects in adults?
Is language non-selectivity already present?
Developmental trajectories of bilingual spoken word recognition
Cross-language phonological priming:
Von Holzen and Mani (2012): German-English (N = 20, 21-43 months), priming through translation paradigm
Mani and Plunkett (2010): English monolinguals, implicit priming
N = 112 children (15 longitudinal)
Average age 26.36 months (SD = 4.01, Range = 20.03–32.5)
English monolinguals, Oxford (United Kingdom)
Figure 6: Participant receptive vocabulary sizes across ages and language profiles.
Figure 7: Participant receptive vocabulary sizes across ages and language profiles.
Thanks